An In-Hospital Mortality Risk Model for Elderly Patients Undergoing Cardiac Valvular Surgery Based on LASSO-Logistic Regression and Machine Learning

Background: To better evaluate and predict the risk of in-hospital mortality in elderly patients receiving cardiac valvular surgery, we developed a new prediction model using least absolute shrinkage and selection operator (LASSO)-logistic regression and machine learning (ML) algorithms. Methods: Clinical data, including baseline characteristics and perioperative data, of 7163 elderly patients undergoing cardiac valvular surgery from January 2016 to December 2018 were collected at 87 hospitals in the Chinese Cardiac Surgery Registry (CCSR). Patients were divided into training (N = 5774 [80%]) and testing samples (N = 1389 [20%]) according to their date of operation. LASSO-logistic regression models and ML models were used to analyze risk factors and develop the prediction model. We compared the discrimination and calibration of each model and EuroSCORE II. Results: A total of 7163 patients were included in this study, with a mean age of 69.8 (SD 4.5) years, and 45.0% were women. Overall, in-hospital mortality was 4.05%. The final model included seven risk factors: age, prior cardiac surgery, cardiopulmonary bypass duration (CPB time), left ventricular ejection fraction (LVEF), creatinine clearance rate (CCr), combined coronary artery bypass grafting (CABG) and New York Heart Association (NYHA) class. LASSO-logistic regression, linear discriminant analysis (LDA), support vector classification (SVC) and logistic regression (LR) models had the best discrimination and calibration in both training and testing cohorts, and were superior to EuroSCORE II. Conclusions: The mortality rate for elderly patients undergoing cardiac valvular surgery was relatively high. LASSO-logistic regression, LDA, SVC and LR can predict the risk of in-hospital mortality in elderly patients receiving cardiac valvular surgery well.


Introduction
Valvular heart diseases (VHDs) are some of the most common cardiovascular diseases in China. The number of patients requiring cardiac valvular surgery has been rising as the social structure of the population ages [1,2]. About 270,000 heart valve surgeries are performed each year worldwide, accounting for 20-35% of all cardiovascular surgeries [3]. The mortality rate during postoperative hospitalization is 2.6-6.8% in some high-income countries, such as the United States, and can be as high as 13.29% or more if multiple valvular surgery or combined CABG is performed at the same time [4]. In China, close to 80,000 heart valve surgeries are performed each year, while the postoperative mortality rate is nearly 2.3% [5,6]. The current increased incidence, surgical comorbidities, operative difficulties of VHDs and high risk of postoperative complications and mortality have brought a heavy medical burden on individuals, families and society.
With the rising incidence of VHDs and number of surgical cases, growing attention from surgeons and the mature application of research methods such as logistic regression, a series of risk prediction models have been developed and applied to patients undergoing cardiac valvular surgery. Examples include the Society of Thoracic Surgeons National Cardiac Surgery Database score (STS-NCD score) [7], the European System for Cardiac Operative Risk Evaluation (EuroSCORE) [8,9] and the original Chinese CABG risk model (SinoSCORE) [10]. These models are based on large databases and on the analysis of risk factors for postoperative complications and mortality, but problems remain with their discrimination, accuracy and applicability.
Compared with young and middle-aged patients, elderly patients have a much higher risk of mortality after cardiac valvular surgery [11]. With the increasing age of China's population, the burden of VHDs in elderly patients has increased significantly.
Traditional statistical tools, such as LASSO-logistic regression, and machine learning are currently the most common methods for predictive model construction, and both share similar tasks, mainly model parameter inference and data fitting or prediction. However, the two fields have different focuses. Statistics is more concerned with the confidence of inference or prediction, while machine learning is more concerned with the predictive performance of the model.
These different emphases also lead to many methodological differences. The statistics community is concerned with the distribution of statistics, whether a hypothesis test is significant and whether the model fit is reasonable. Machine learning is concerned with problems directly related to improving predictive performance, such as how to design models or objective functions, how to train them and how to improve the efficiency of algorithms. We applied both statistical methods and machine learning to better predict the risk of mortality in elderly patients undergoing cardiac valvular surgery.
To achieve continuous quality monitoring and improvement in adult cardiac surgery, the Chinese Cardiac Surgery Registry (CCSR) database was established in 2013 [12]. This study aims to identify risk factors for mortality after cardiac valvular surgery by analyzing the clinical data of elderly patients included in the CCSR database, and to construct prediction models using LASSO-logistic regression and machine learning algorithms, providing new ideas for mortality risk assessment.

Study Population
The CCSR database includes data from consecutive patients undergoing cardiac surgery at 87 participating centers located in nearly all provinces and directly controlled municipalities in China [12]. Preoperative risk factors and in-hospital deaths are recorded for each patient.
From January 2016 to December 2018, 7163 elderly patients undergoing cardiac valvular surgery included in the CCSR database were selected for the study, including 3939 males and 3224 females with a mean age of 69.8 (SD 4.5) years (Figure 1). Inclusion criteria were: cardiac valvular surgery, including mitral, tricuspid, aortic and pulmonary valve surgery, with or without other concomitant cardiovascular surgery; and age greater than or equal to 65 years. Exclusion criteria were: incomplete surgical treatment; incomplete medical records. J. Cardiovasc. Dev. Dis. 2023, 9, x FOR PEER REVIEW 3 of 20

Definitions of Parameters
In-hospital mortality in our research was defined as all-cause death occurring between the surgery and hospital discharge.
Definitions of demographics and clinical variables are shown in Table 1.

Data Collection
Recorded variables included clinical, perioperative and laboratory data. All patients were examined using preoperative echocardiography to record left ventricular ejection fraction (LVEF), left ventricular end diastolic diameter (LVEDD), left atrial dimension (LAD) and valvular lesions including aortic stenosis (AS), aortic insufficiency (AI), mitral stenosis (MS), mitral insufficiency (MI), tricuspid stenosis (TS), tricuspid insufficiency (TI), pulmonary stenosis (PS) and pulmonary insufficiency (PI). Candidate risk factors for the model recorded as clinical data included: age, gender, body mass index (BMI), comorbidities, preoperative medications, tobacco use, alcohol use, nutrition, NYHA classification and Canadian Cardiovascular Society (CCS) angina classification. Perioperative data included surgical status, surgical approach, CPB time, aortic cross-clamp (ACC) time, mechanical ventilation time, estimated blood loss, blood transfusion including red blood cells (RBC) and fresh frozen plasma (FFP), length of hospital stay, length of ICU stay and complications. Laboratory results included total cholesterol (TC), low density lipoprotein (LDL), fasting blood glucose (FBG) and serum creatinine (SCr). CCr was calculated using the Cockcroft-Gault formula. Comorbidities included hypertension, diabetes mellitus, dyslipidemia, cerebrovascular accident, chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), peripheral vascular disease (PVD), cardiac arrhythmias, coronary artery disease (CAD), prior myocardial infarction (MI), prior percutaneous coronary intervention (PCI), prior heart failure (HF) and prior cardiac surgery. The primary study endpoint was in-hospital mortality, defined as all-cause in-hospital death after cardiac valvular surgery.
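The Cockcroft-Gault formula is named above but not written out. The following is a minimal sketch of the standard form of that formula, CCr = (140 − age) × weight / (72 × SCr), multiplied by 0.85 for women, with SCr in mg/dL. The function names and the example inputs are illustrative assumptions. CCr is reported in this paper in mL/min/1.73 m², which suggests a further body-surface-area normalization; because that step is not described, it is not included here.

```python
def cockcroft_gault(age_years, weight_kg, scr_mg_dl, female):
    """Estimated creatinine clearance in mL/min (standard Cockcroft-Gault)."""
    ccr = (140 - age_years) * weight_kg / (72 * scr_mg_dl)
    # The standard formula applies a factor of 0.85 for women.
    return ccr * 0.85 if female else ccr


def umol_l_to_mg_dl(scr_umol_l):
    """Convert serum creatinine from umol/L to mg/dL (1 mg/dL = 88.4 umol/L)."""
    return scr_umol_l / 88.4


# Hypothetical patient: 70 years, 60 kg, SCr 1.0 mg/dL.
example_ccr_male = cockcroft_gault(70, 60.0, 1.0, female=False)
example_ccr_female = cockcroft_gault(70, 60.0, 1.0, female=True)
```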

Statistical Analysis
Continuous variables were presented as mean ± standard deviation (SD) if they followed a normal distribution, or as median (quartiles) otherwise, and were compared using Student's t-test or the Mann-Whitney test, respectively. Categorical variables were presented as frequencies (percentages) and compared using Chi-square and Fisher's exact tests. The Kolmogorov-Smirnov test was adopted for normality testing. All p values were two-tailed and p < 0.05 was considered significant. Data were analyzed using SPSS version 26.0 (IBM Corp., Armonk, NY, USA) and GraphPad Prism version 9.3.1 (GraphPad Software, San Diego, CA, USA). R version 4.2.1 (R Foundation for Statistical Computing, Vienna, Austria) with the "rms", "CBCgrps", "caret" and "glmnet" packages was used to build the LASSO-logistic prediction model, and Python version 3.10 (Python Software Foundation, Wilmington, DE, USA) was used to build the machine learning models. At the traditional statistical analysis level, missing values in continuous columns were replaced with the mean of the remaining values in the column. Missing values in categorical columns (string or numeric) were replaced with the most common category. For variables with longitudinal behavior (time variables in this study), the last valid observation was used to fill in the missing value.
At the machine learning level, this study first discarded samples with a high proportion of missing values: when 20% or more of a patient's clinical data were missing, the patient was excluded from the data set. Second, the remaining missing values were predicted and filled in with the deep learning imputation method of the Datawig library (https://github.com/awslabs/datawig) (accessed on 21 April 2021).
In the training set, LASSO regression was used to screen variables. We used tenfold cross-validation to select the penalty term, λ. The binomial deviance was computed for the held-out data as a measure of the predictive performance of the fitted models. Logistic regression analysis was performed with the "Forward LR" method. Subsequently, we constructed a nomogram based on the logistic regression model using the "rms" package. In the nomogram, each predictor is drawn as a line segment scaled according to the regression results, so that the risk of the event or the survival probability of an individual can easily be read off the graph.
In addition, ML prediction models were established using 11 algorithms: AdaBoost, BernoulliNB, decision tree (DT), gradient boosting (GB), K-nearest neighbor (KNN), linear discriminant analysis (LDA), support vector classification (SVC), logistic regression (LR), random forest (RF), stochastic gradient descent (SGD) and extreme gradient boosting (XGBoost). Before this, the dimension of the extracted features was reduced using the Select K Best (K-Best) algorithm and the traditional least absolute shrinkage and selection operator (LASSO) algorithm. Finally, the selected features were used to build the machine learning models. We used the area under the curve (AUC) to evaluate the discriminatory performance of each model, and the Hosmer-Lemeshow goodness-of-fit test, calibration curve and Brier score to assess model calibration. In addition, the prediction performance of the EuroSCORE II model was compared with that of our model using the DeLong test.
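The three simple imputation rules described for the traditional statistical analysis (mean for continuous columns, most common category for categorical columns, last valid observation for time variables) can be sketched as follows. This is an illustrative sketch using only the Python standard library, not the code used in the study; missing values are represented as None and all function names and example values are assumptions.

```python
from statistics import mean, mode


def impute_mean(values):
    """Replace missing entries with the mean of the observed entries."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_mode(values):
    """Replace missing entries with the most common observed category."""
    fill = mode(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def forward_fill(values):
    """Carry the last valid observation forward; leading gaps stay missing."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical columns: LVEF (%), NYHA class, and a repeated time measurement.
lvef_filled = impute_mean([60.0, None, 50.0])
nyha_filled = impute_mode(["III", None, "II", "III"])
time_filled = forward_fill([None, 120, None, 150, None])
```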
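The first machine learning preprocessing step, excluding patients with 20% or more of their fields missing, can be sketched as below. The record layout, field names and the use of None for missing values are assumptions made for illustration. The subsequent Datawig imputation step is not sketched, because its exact configuration is not described.

```python
def missing_fraction(record):
    """Fraction of fields in one patient record that are missing (None)."""
    return sum(v is None for v in record.values()) / len(record)


def drop_sparse_records(records, threshold=0.20):
    """Keep only records whose missing fraction is below the threshold."""
    return [r for r in records if missing_fraction(r) < threshold]


# Hypothetical records with 10 fields each.
fields = [f"var{i}" for i in range(10)]
one_missing = {name: (None if i < 1 else 1.0) for i, name in enumerate(fields)}
two_missing = {name: (None if i < 2 else 1.0) for i, name in enumerate(fields)}

# 10% missing is kept; 20% missing meets the threshold and is excluded.
kept = drop_sparse_records([one_missing, two_missing])
```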
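For readers less familiar with the two headline metrics, the sketch below shows how AUC and the Brier score are defined, using only the Python standard library. AUC is computed as the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor (ties counted as one half), and the Brier score is the mean squared difference between predicted probability and outcome. The toy labels and probabilities are invented for illustration and are not study data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the fraction of (event, non-event) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = in-hospital death, 0 = survived.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(labels, predicted)            # 3 of 4 pairs ranked correctly
toy_brier = brier_score(labels, predicted)
```

A higher AUC indicates better discrimination, while a lower Brier score indicates predictions that are closer to the observed outcomes.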
In-hospital mortality in the aggregate cohort was 4.05% (290/7163). With regard to preoperative factors, patients who died and those who survived differed significantly in age, tobacco use, dyslipidemia, cerebrovascular accident, prior heart failure, NYHA and CCS classification, cardiac arrhythmias, prior myocardial infarction, prior cardiac surgery, SCr, CCr, TC, LDL, FBG, LVEF, MS, MI grades and TI grades. As for intraoperative characteristics, mortality was significantly associated with CPB time, ACC time, combined CABG and blood transfusion (p < 0.05).

Screening Results of Variables of the Prediction Models
According to the results of the univariate analysis (variables with p < 0.1 were retained) and possible clinical significance, the candidate factors for the prediction models were as follows: age, gender, BMI, tobacco use, hypertension, diabetes mellitus, dyslipidemia, COPD, PVD, prior cerebrovascular accident, prior HF, CCS class, NYHA class, atrial flutter/atrial fibrillation, prior MI, prior cardiac surgery, SCr, CCr, TC, LDL, FBG, LVEF, LVEDD, CPB time, ACC time, combined CABG, etc.

Establishment of the LASSO-Logistic Regression Prediction Model
In the training sample, in-hospital mortality was used as the dependent variable, the variables were screened using the LASSO regression algorithm, and the best λ value was selected by 10-fold cross-validation (Figures 2 and 3). The two dashed lines in Figure 2 indicate lambda.min and lambda.1se, respectively. Lambda.min denotes the value of λ at which the cross-validated model error is minimal. Lambda.1se denotes the largest value of λ whose error is within one standard error of that minimum. At lambda.1se, the fit is preserved while the fewest variables are included, yielding the most streamlined prediction model. The variables selected by LASSO regression were age, LVEF, combined CABG, CCr, prior cardiac surgery, CPB time and NYHA class (Table 3).
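The selection rule behind the two dashed lines can be made concrete with a short sketch. Given the mean cross-validated error and its standard error for each candidate λ, lambda.min is the λ with the smallest mean error, and lambda.1se is the largest λ whose mean error does not exceed that minimum plus one standard error. The function name and the numbers below are hypothetical and are not taken from Figure 2.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # Error ceiling: minimum mean error plus one standard error at that point.
    limit = cv_mean[i_min] + cv_se[i_min]
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_mean) if err <= limit
    )
    return lambdas[i_min], lambda_1se


# Hypothetical cross-validation results (binomial deviance per candidate).
candidate_lambdas = [0.001, 0.005, 0.01, 0.02, 0.05]
mean_deviance = [0.322, 0.315, 0.317, 0.319, 0.340]
se_deviance = [0.005, 0.005, 0.005, 0.005, 0.005]

lambda_min, lambda_1se = select_lambdas(
    candidate_lambdas, mean_deviance, se_deviance
)
```

Because a larger λ shrinks more coefficients to zero, choosing lambda.1se trades a small, statistically indistinguishable loss of fit for a sparser model.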

The LASSO-logistic model equation is Logit(p) = −5.669 + 0.036 × age (years) + 0.928 × prior cardiac surgery − 0.026 × LVEF (%) + 0.01 × CPB time (min) + 0.389 × combined CABG + 0.328 × NYHA class − 0.021 × CCr (mL/min/1.73 m²).
In the training data, the AUC of the LASSO-logistic regression prediction model was 0.785 (95% CI: 0.746-0.824) (Figure 4a). The Hosmer-Lemeshow χ² was 10.731 (p = 0.217) and the calibration curve of the prediction model was close to the standard curve (Figure 5a), suggesting that the model predicted the risk of in-hospital mortality after heart valve surgery with high accuracy and good agreement with the actual risk of occurrence. The AUC of the EuroSCORE II prediction model was 0.627 (95% CI: 0.582-0.672).
In the testing sample, the AUC of this model was 0.739 (95% CI: 0.673-0.805) (Figure 4b). The Hosmer-Lemeshow χ² was 6.64 (p = 0.576) and the calibration curve of the prediction model was close to the standard curve (Figure 5b). The AUC of the EuroSCORE II prediction model in the testing sample was 0.642 (95% CI: 0.562-0.722). The LASSO-logistic regression model outperformed the EuroSCORE II prediction model in terms of discrimination and calibration.
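The reported Hosmer-Lemeshow p values can be checked from the χ² statistics alone. The sketch below assumes the conventional grouping into 10 risk groups, which gives 8 degrees of freedom; the number of groups is not stated in the text, so this is an assumption. For an even number of degrees of freedom the upper-tail probability of the χ² distribution has a closed form, so no statistics library is needed. Under this assumption the two statistics give p values of about 0.217 and 0.576, matching the reported values.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Assumed: 10 risk groups, hence 10 - 2 = 8 degrees of freedom.
p_training = chi2_sf_even_df(10.731, 8)
p_testing = chi2_sf_even_df(6.64, 8)
```

A p value above 0.05 means the test finds no significant difference between predicted and observed mortality across risk groups, which is read as acceptable calibration.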
Variables from the LASSO-logistic regression model were used to build a nomogram model (Figure 6). For example, for a patient aged 75 years, with no prior cardiac surgery, a preoperative LVEF of 66.0%, a CPB time of 181 min, no combined CABG, a preoperative CCr of 55.1 mL/min/1.73 m² and NYHA class III, a risk score of 260 points and a risk of mortality of 4.85% could be calculated according to the nomogram model (Figure 7).
Figure 6. Nomogram prediction model of in-hospital mortality for elderly patients undergoing cardiac valvular surgery. Each variable corresponds to a risk score, and the total score, obtained by summing all scores, maps to the corresponding risk of mortality.
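As a cross-check of the worked patient above, the published model equation can be evaluated directly and converted to a probability with the logistic function. The sketch below assumes that prior cardiac surgery and combined CABG are coded as 0/1 and that NYHA class enters as an integer from 1 to 4; these codings are assumptions, as are all variable names. With the rounded coefficients as published, the predicted risk is roughly 4.5%, close to but not identical with the 4.85% read from the nomogram; the gap is plausibly due to coefficient rounding.

```python
import math

# Coefficients copied from the reported model equation (rounded as published).
INTERCEPT = -5.669
COEFFICIENTS = {
    "age_years": 0.036,
    "prior_cardiac_surgery": 0.928,  # assumed coding: 1 = yes, 0 = no
    "lvef_percent": -0.026,
    "cpb_time_min": 0.010,
    "combined_cabg": 0.389,          # assumed coding: 1 = yes, 0 = no
    "nyha_class": 0.328,             # assumed coding: integer 1-4
    "ccr": -0.021,                   # mL/min/1.73 m^2
}


def predicted_mortality(patient):
    """Convert the linear predictor (logit) into a probability."""
    logit = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-logit))


# The example patient described in the text.
example_patient = {
    "age_years": 75,
    "prior_cardiac_surgery": 0,
    "lvef_percent": 66.0,
    "cpb_time_min": 181,
    "combined_cabg": 0,
    "nyha_class": 3,
    "ccr": 55.1,
}
example_risk = predicted_mortality(example_patient)
```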

Establishment of ML Prediction Models
In the training data, model building and internal validation were performed using 10-fold cross-validation. The ROC curves are shown in Figures 8 and 9 and the Brier scores are shown in Figure 10. The results showed that LDA, SVC and LR had the largest AUCs and the lowest Brier scores, which suggests superior discrimination and calibration compared with the other ML prediction models. In addition, the nomogram had a larger AUC than the other ML models and a Brier score similar to those of the LDA, SVC and LR models.

In the testing cohort, we externally validated the nomogram model and the LDA, SVC and LR machine learning algorithms, which had the best prediction performance in the training cohort (Figures 11 and 12). All four prediction models showed satisfactory discrimination and calibration, demonstrating their promise for assessing the mortality risk of elderly patients after cardiac valvular surgery.
Figure 11. AUCs of nomogram and ML prediction models in testing cohort.

Discussion
VHDs are increasingly becoming an important public health problem, with a rising overall prevalence and proportion of patients requiring surgical treatment [1,2]. Degenerative and functional lesions are the main causes in high-income countries, while rheumatic lesions are still the main cause in low- and middle-income countries [1]. Shengshou Hu et al. released an update of the Annual Report on Cardiovascular Health and Disease in China 2021, showing that the number of patients with cardiovascular diseases in China is about 330 million, of which the prevalence of VHDs is 3.8%, with about 25 million people affected [6]. The establishment of a cardiac valvular surgery risk assessment model is of great significance for the accurate identification of high-risk cases, adequate assessment of surgical risks, targeted perioperative management and comprehensive improvement of the level of diagnosis and treatment.
The prevalence of VHDs increases significantly with age, from about 0.7-2.1% in people aged 18-54 years to as high as 7.6-15.9% in elderly patients aged 65 years or older [13]. With the increasing age of China's population, the burden of VHDs and the number of patients requiring surgical treatment, especially elderly patients, have increased significantly. Compared with young and middle-aged patients, elderly patients have a much higher prevalence of comorbidities such as hypertension, diabetes mellitus, cerebrovascular disease, chronic kidney disease and atrial fibrillation, a weaker ability to tolerate cardiopulmonary bypass surgery, more complex pathologic anatomy such as valve annulus calcification, leaflet thickening and leaflet prolapse, and a higher risk of postoperative death and related complications [13,14]. The perioperative management of elderly patients undergoing cardiac valvular surgery requires more clinical attention, and there is a lack of corresponding mortality risk prediction models and studies. Therefore, this study aims to establish a risk prediction model for postoperative mortality in elderly patients aged 65 years or older and to provide more accurate treatment plans for them.
The high postoperative mortality rate in elderly patients undergoing cardiac valvular surgery has been of great concern to cardiothoracic surgeons. The in-hospital mortality rate in the aggregate cohort was approximately 4.05% in our research, which was significantly higher than the overall population mortality rate (2.16-2.64%) and lower than the 8-20% mortality rate reported in previous studies [13,15]. Yoshida et al. found that the in-hospital mortality rate was 9.6% in patients 65 years or older, whereas in patients younger than 65 years, the rate was 3.2% [15]. The analysis by Susheel K. Kodali et al. found that in-hospital mortality in patients over 80 years undergoing cardiac valvular surgery could be as high as 20%, and may be higher if multiple valves were involved or if surgery was combined with CABG [13]. In our study, the mean age of the patients included was 69.8 years and 4470 patients (62.4%) were in the 65-70 range. The comparatively low mortality rate may have two explanations. On the one hand, it may be due to improvements at large cardiac centers: Zhan Hu et al. found that advances in cardiac surgical care and the Chinese government's efforts to improve population health have helped to decrease surgical mortality over time, and that existing risk models may now overestimate surgical mortality and discriminate poorly between patients [16]. On the other hand, the patients included in our study were not very old and had a relatively low risk of mortality. In the study from Zhuge, 45.29% of patients over 60 years with mitral valve insufficiency were not treated surgically, compared with 10% in patients over 80 years. Older age, impaired LVEF, lower regurgitation grade, EuroSCORE II high risk stratification and diabetes were the factors most significantly associated with surgery denial among elderly Chinese inpatients with mitral regurgitation (MR) [17]. For elderly patients, surgical decisions are made more cautiously.
Risk factors for mortality after cardiac valvular surgery have long been a focus of cardiovascular surgery research. Previous studies have shown that many factors are strongly associated with mortality, such as age, LVEF, BMI, renal function, combined CABG and NYHA class [7,9,18]. Seven variables, including age, LVEF and CCr, were included in our nomogram model, which maximized simplicity and efficiency. Compared with previous studies, our model included two relatively new variables that have rarely been incorporated in previous predictive models.
First, our study showed that prior cardiac surgery was an independent risk factor (OR = 2.529, 95% CI = 1.572-4.070, p < 0.001). The proportion of patients with prior cardiac surgery was 5.5% overall and 12.1% in the death group. Most of these patients had undergone valvular surgery, with a small percentage having undergone CABG. In 2004, Nowicki et al. analyzed data from the NNECDSG database on 8943 cardiac valvular surgery patients from 1991 to 2001 and applied logistic regression analysis to develop a prediction model for the risk of in-hospital mortality after aortic and mitral valve surgery, incorporating prior cardiac surgery as one risk factor [19]. Because of the severe pleuropericardial adhesions resulting from the initial procedure, reoperation is associated with greater difficulty, more blood loss, longer operative time and a greater risk of cardiac injury, so the resulting increase in the risk of postoperative death is easily understood. With improved overall life expectancy and survival after cardiac surgery, the persistence of coronary artery disease, the extensive use of bioprosthetic heart valves and the rapid evolution of mitral valvuloplasty, the number of patients undergoing cardiac reoperation is increasing continuously. Lin H. et al. found that the proportion of mitral valvuloplasty in the CCSR database showed a rapid increase, rising 11.9% over three years [20]. This is a reminder of the need to pay more attention to patients undergoing reoperative cardiac surgery, with adequate preoperative evaluation and preparation, intraoperative use of appropriate access, intubation and more appropriate surgical equipment to reduce the risk of complications and mortality [1].
Second, the prediction model incorporated the less-studied CPB time, which is an important controllable factor in valvular surgery. The mean CPB time was 196.4 min in the death group and 126.4 min in the non-death group; every minute counts while the aorta is clamped during surgery (OR = 1.010, 95% CI: 1.009-1.012, p < 0.001). Stefano Salis et al. analyzed the data of 5006 patients undergoing CPB surgery and found that CPB time was an independent risk factor for postoperative death (OR = 1.57, 95% CI: 1.43-1.73, p < 0.0001) [21]. In addition, increased CPB time leads to a significant increase in the incidence of renal, respiratory and neurological complications, multi-organ dysfunction and multiple blood transfusions. CPB time is currently considered to be associated with activation of the inflammatory cascade caused by the release of various inflammatory mediators, leading to an increased risk of organ dysfunction and mortality [22][23][24]. On the other hand, increased CPB time implies prolonged operative time and assisted circulation time, which can partly reflect the difficulty of surgery and the more critical condition of the patient.
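A per-minute odds ratio of 1.010 looks small, so a short calculation helps convey its scale. The sketch below first confirms that the odds ratios quoted in the Discussion are the exponentials of the corresponding coefficients in the model equation (0.928 for prior cardiac surgery and 0.01 for CPB time), and then compounds the per-minute odds ratio over the 70-minute difference between the mean CPB times of the two groups (196.4 − 126.4 min). This assumes the log-odds are linear in CPB time, as the logistic model does, and is meant only to illustrate scale, not to restate a study result.

```python
import math

# Odds ratio = exp(coefficient) in a logistic regression model.
or_prior_surgery = math.exp(0.928)   # coefficient from the model equation
or_cpb_per_minute = math.exp(0.010)  # coefficient from the model equation

# Difference between the mean CPB times of the death and non-death groups.
cpb_difference_min = 196.4 - 126.4

# Compounding the reported per-minute odds ratio over that difference.
or_cpb_over_difference = 1.010 ** cpb_difference_min
```

Under these assumptions, 70 additional minutes of CPB corresponds to roughly a doubling of the odds of in-hospital death.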
In this study, we used LASSO-logistic regression and machine learning analysis for modeling. The constructed prediction model outperformed the traditional EuroSCORE II in the training and testing cohorts in terms of discrimination and calibration (p < 0.05) and was more suitable for the assessment of mortality risk in elderly patients after cardiac valvular surgery. This study has the following advantages over EuroSCORE II. First, previous prediction models such as the STS-NCD score and EuroSCORE are based on data from Western populations [7,9], and there may be significant differences in disease features, therapeutic strategies and surgical techniques between regions [7]. Our study was supported by the CCSR database. Established in 2013, the CCSR is a nationwide multicenter registry that provides a platform for risk assessment, outcome evaluation and quality improvement for adult cardiac operations in mainland China, with the advantages of a broader range of participating centers, strong representation and high-quality data [12,16]. Our study better reflected the level of cardiac surgery in China and was more accurate in predicting the risk of mortality after valve surgery. Second, based on clinical data from patients treated more than 10 years earlier, Nashef et al. proposed the EuroSCORE I scoring system in 1999 [8], which was updated to EuroSCORE II in 2012 [9]. As cardiac surgery technology has improved and disease characteristics have changed over the past decade, the risks and predictive variables of mortality from valve surgery in previous studies are no longer representative of current clinical practice. Third, a total of 18 predictive variables are included in EuroSCORE II [9], while only seven variables are included in our prediction model, which makes it easier to use.
Overall, our predictive model may serve as a better assessment tool for evaluating the risk of mortality after cardiac valvular surgery in the Chinese elderly population, and its potential applicability could be investigated in other regions or populations in the future.
Machine learning has been a cutting-edge interdisciplinary research direction in recent years and is increasingly used in clinical settings [25,26]. Machine learning has unique advantages in handling large sample data, complex data and personalized assessments [27]. As an emerging technology, ML approaches have been applied in medical fields such as imaging diagnosis, pharmaceutical research and the establishment of prediction models. A series of studies have already applied machine learning algorithms to risk prediction for cardiac surgery, including mortality and postoperative complications such as acute kidney injury, myocardial infarction and readmission [28][29][30][31]. Allyn et al. found that machine learning algorithms were far more accurate than EuroSCORE II and logistic regression in predicting in-hospital mortality after elective cardiac surgery [32]. Machine learning involves a series of algorithms, and different algorithms learn in different ways and suit different application scenarios, so we need to evaluate the prediction performance of each machine learning algorithm and select the most suitable prediction model. By comparing AUC values and Brier scores, our study found that the LDA, SVC and LR algorithms demonstrated excellent predictive performance for postoperative mortality risk assessment. We introduced machine learning algorithms for a preliminary analysis to explore their potential for risk assessment. The advantages of machine learning algorithms may become more apparent in the future as the research continues and the sample size, as well as the collection of variables, expands.

Limitations
Our study has certain limitations. First, because of the limitations of the CCSR database, our primary outcome was in-hospital mortality after valvular surgery rather than the widely used 30-day mortality. Second, the LASSO-logistic regression model incorporates an intraoperative variable, CPB time, which limits its clinical application. Third, our study is limited by the lack of follow-up data on survival and other primary outcomes, and further research and wider application are needed in the future.

Conclusions
The mortality rate for elderly patients undergoing cardiac valvular surgery was relatively high. LASSO-logistic regression, LDA, SVC and LR can predict the risk for in-hospital mortality in elderly patients receiving cardiac valvular surgery well.